percentage increase
End-to-end Deep Reinforcement Learning for Stochastic Multi-objective Optimization in C-VRPTW
Abouelrous, Abdo, Bliek, Laurens, Wu, Yaoxin, Zhang, Yingqian
In this work, we consider learning-based applications in routing to solve a Vehicle Routing variant characterized by stochasticity and multiple objectives. Such problems are representative of practical settings where decision-makers have to deal with uncertainty in the operational environment as well as multiple conflicting objectives due to different stakeholders. We specifically consider travel time uncertainty. We also consider two objectives, total travel time and route makespan, that jointly target operational efficiency and labor regulations on shift length, although different objectives could be incorporated. Learning-based methods offer earnest computational advantages as they can repeatedly solve problems with limited interference from the decision-maker. We specifically focus on end-to-end deep learning models that leverage the attention mechanism and multiple solution trajectories. These models have seen several successful applications in routing problems. However, since travel times are not a direct input to these models due to the large dimensions of the travel time matrix, accounting for uncertainty is a challenge, especially in the presence of multiple objectives. In turn, we propose a model that simultaneously addresses stochasticity and multi-objectivity and provide a refined training mechanism for this model through scenario clustering to reduce training time. Our results show that our model is capable of constructing a Pareto Front of good quality within acceptable run times compared to three baselines.
Rescheduling after vehicle failures in the multi-depot rural postman problem with rechargeable and reusable vehicles
Sathyamurthy, Eashwar, Herrmann, Jeffrey W., Azarm, Shapour
We present a centralized auction algorithm to solve the Multi-Depot Rural Postman Problem with Rechargeable and Reusable Vehicles (MD-RPP-RRV), focusing on rescheduling arc routing after vehicle failures. The problem involves finding heuristically obtained best feasible routes for multiple rechargeable and reusable vehicles with capacity constraints capable of performing multiple trips from multiple depots, with the possibility of vehicle failures. Our algorithm auctions the failed trips to active (non-failed) vehicles through local auctioning, modifying initial routes to handle dynamic vehicle failures efficiently. When a failure occurs, the algorithm searches for the best active vehicle to perform the failed trip and inserts the trip into that vehicle's route, which avoids a complete rescheduling and reduces the computational effort. We compare the algorithm's solutions against offline optimal solutions obtained from solving a Mixed Integer Linear Programming (MILP) formulation using the Gurobi solver; this formulation assumes that perfect information about the vehicle failures and failure times is given. The results demonstrate that the centralized auction algorithm produces solutions that are, in some cases, near optimal; moreover, the execution time for the proposed approach is much more consistent and is, for some instances, orders of magnitude less than the execution time of the Gurobi solver. The theoretical analysis provides an upper bound for the competitive ratio and computational complexity of our algorithm, offering a formal performance guarantee in dynamic failure scenarios.
HLOB -- Information Persistence and Structure in Limit Order Books
Briola, Antonio, Bartolucci, Silvia, Aste, Tomaso
Their complexity stems from two main factors: (i) the interaction of a large number of agents pursuing heterogeneous goals at different time scales through the implementation of trading strategies designed to leverage asymmetric information; (ii) the emergence of selforganizing collective behaviors that do not result from the existence of any central controller and are therefore difficult to anticipate. The concurrence of these aspects contributes to the sporadic and limited-in-time persistence of inefficiencies that make the trading practice profitable. The analysis of existing inefficiencies and the forecasting of new ones is made possible by the mathematical and statistical modeling of the time series reflecting the financial market's behavior. The granularity of these time series widely varies depending on the goal of the analysis, and, in the high-frequency case (i.e., the scenario we are mainly interested in), it can be order-driven with a resolution up to the nanosecond [31]. Indeed, the majority of modern financial exchanges store order-level updates in data structures known as Limit Order Books (LOBs).
Complexity-calibrated Benchmarks for Machine Learning Reveal When Next-Generation Reservoir Computer Predictions Succeed and Mislead
Marzen, Sarah E., Riechers, Paul M., Crutchfield, James P.
Recurrent neural networks are used to forecast time series in finance, climate, language, and from many other domains. Reservoir computers are a particularly easily trainable form of recurrent neural network. Recently, a "next-generation" reservoir computer was introduced in which the memory trace involves only a finite number of previous symbols. We explore the inherent limitations of finite-past memory traces in this intriguing proposal. A lower bound from Fano's inequality shows that, on highly non-Markovian processes generated by large probabilistic state machines, next-generation reservoir computers with reasonably long memory traces have an error probability that is at least ~ 60% higher than the minimal attainable error probability in predicting the next observation. More generally, it appears that popular recurrent neural networks fall far short of optimally predicting such complex processes. These results highlight the need for a new generation of optimized recurrent neural network architectures. Alongside this finding, we present concentration-of-measure results for randomly-generated but complex processes. One conclusion is that large probabilistic state machines -- specifically, large $\epsilon$-machines -- are key to generating challenging and structurally-unbiased stimuli for ground-truthing recurrent neural network architectures.
Analysis of Reinforcement Learning for determining task replication in workflows
McGough, Andrew Stephen, Forshaw, Matthew
Executing workflows on volunteer computing resources where individual tasks may be forced to relinquish their resource for the resource's primary use leads to unpredictability and often significantly increases execution time. Task replication is one approach that can ameliorate this challenge. This comes at the expense of a potentially significant increase in system load and energy consumption. We propose the use of Reinforcement Learning (RL) such that a system may `learn' the `best' number of replicas to run to increase the number of workflows which complete promptly whilst minimising the additional workload on the system when replicas are not beneficial. We show, through simulation, that we can save 34% of the energy consumption using RL compared to a fixed number of replicas with only a 4% decrease in workflows achieving a pre-defined overhead bound.
Recommending Groups to Users Using User-Group Engagement and Time-Dependent Matrix Factorization
Wang, Xin (Simon Fraser University) | Donaldson, Roger (The University of British Columbia) | Nell, Christopher (DeviantArt, Inc.) | Gorniak, Peter (DeviantArt, Inc.) | Ester, Martin (Simon Fraser University) | Bu, Jiajun (Zhejiang University)
Social networks often provide group features to help users with similar interests associate and consume content together. Recommending groups to users poses challenges due to their complex relationship: user-group affinity is typically measured implicitly and varies with time; similarly, group characteristics change as users join and leave. To tackle these challenges, we adapt existing matrix factorization techniques to learn user-group affinity based on two different implicit engagement metrics: (i) which group-provided content users consume; and (ii) which content users provide to groups. To capture the temporally extended nature of group engagement we implement a time-varying factorization. We test the assertion that latent preferences for groups and users are sparse in investigating elastic-net regularization. Experiments using data from DeviantArt indicate that the time-varying implicit engagement-based model provides the best top-K group recommendations, illustrating the benefit of the added model complexity.